Deep reinforcement learning for optimal trading with partial information
Macrì, Andrea, Jaimungal, Sebastian, Lillo, Fabrizio
Reinforcement Learning (RL) applied to financial problems has been a lively area of research. The use of RL for optimal trading strategies that exploit latent information in the market has, to the best of our knowledge, not been widely tackled. In this paper we study an optimal trading problem where a trading signal follows an Ornstein-Uhlenbeck process with regime-switching dynamics. We employ a blend of RL and Recurrent Neural Networks (RNN) to extract the underlying information from the trading signal with latent parameters. The latent parameters driving the mean-reversion level, speed, and volatility are filtered from observations of the signal, and trading strategies are derived via RL. To address this problem, we propose three Deep Deterministic Policy Gradient (DDPG)-based algorithms that integrate Gated Recurrent Unit (GRU) networks to capture temporal dependencies in the signal. The first, a one-step approach (hid-DDPG), directly encodes hidden states from the GRU into the RL trader. The second and third are two-step methods: one (prob-DDPG) makes use of posterior regime probability estimates, while the other (reg-DDPG) relies on forecasts of the next signal value. Through extensive simulations with increasingly complex Markovian regime dynamics for the trading signal's parameters, as well as an empirical application to equity pair trading, we find that prob-DDPG achieves superior cumulative rewards and exhibits more interpretable strategies. By contrast, reg-DDPG provides limited benefits, while hid-DDPG offers intermediate performance with less interpretable strategies. Our results show that the quality and structure of the information supplied to the agent are crucial: embedding probabilistic insights into latent regimes substantially improves both the profitability and robustness of reinforcement learning-based trading strategies.
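As a rough illustration of the signal model described in this abstract, a regime-switching Ornstein-Uhlenbeck path can be simulated with a two-state Markov chain modulating the OU parameters. All numerical values below are illustrative assumptions, not the paper's calibration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-regime OU parameters (theta = long-run mean,
# kappa = mean-reversion speed, sigma = volatility).
theta = np.array([0.5, -0.5])
kappa = np.array([5.0, 15.0])
sigma = np.array([0.3, 0.6])
P = np.array([[0.995, 0.005],    # per-step Markov transition matrix
              [0.010, 0.990]])

dt, n_steps = 1e-3, 5000
x = np.zeros(n_steps)
z = 0                            # latent regime, unobserved by the trader
for t in range(1, n_steps):
    z = rng.choice(2, p=P[z])    # possible regime switch
    # Euler-Maruyama step of dX = kappa (theta - X) dt + sigma dW
    x[t] = (x[t - 1] + kappa[z] * (theta[z] - x[t - 1]) * dt
            + sigma[z] * np.sqrt(dt) * rng.standard_normal())
```

The trader in the paper observes only `x`, never `z`; the three DDPG variants differ in how they try to recover the information `z` carries.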
Deep Q-Network (DQN) multi-agent reinforcement learning (MARL) for Stock Trading
Tidwell, John Christopher, Tidwell, John Storm
This project addresses the challenge of automated stock trading, where traditional methods and direct reinforcement learning (RL) struggle with market noise, complexity, and generalization. Our proposed solution is an integrated deep learning framework combining a Convolutional Neural Network (CNN) to identify patterns in technical indicators formatted as images, a Long Short-Term Memory (LSTM) network to capture temporal dependencies across both price history and technical indicators, and a Deep Q-Network (DQN) agent which learns the optimal trading policy (buy, sell, hold) based on the features extracted by the CNN and LSTM. The CNN and LSTM act as sophisticated feature extractors, feeding processed information to the DQN, which learns through RL. We trained and evaluated this model on historical daily stock data, using distinct periods for training, testing, and validation. Performance was assessed by comparing the agent's returns and risk on out-of-sample test data against baseline strategies, including passive buy-and-hold approaches. This analysis, along with insights gained from explainability techniques into the agent's decision-making process, aimed to demonstrate the effectiveness of combining specialized deep learning architectures, document challenges encountered, and potentially uncover learned market insights.
Finding Moving-Band Statistical Arbitrages via Convex-Concave Optimization
Johansson, Kasper, Schmelzer, Thomas, Boyd, Stephen
We propose a new method for finding statistical arbitrages that can contain more assets than just the traditional pair. We formulate the problem as seeking a portfolio with the highest volatility, subject to its price remaining in a band and a leverage limit. This optimization problem is not convex, but can be approximately solved using the convex-concave procedure, a specific sequential convex programming method. We show how the method generalizes to finding moving-band statistical arbitrages, where the price band midpoint varies over time.
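The convex-concave procedure mentioned above can be sketched in toy form: maximizing the convex objective s'Cs (the variance of the portfolio price) is replaced at each iteration by maximizing its linearization at the current point, which is a linear program over the band and leverage constraints. The data, band width, and per-asset box bound below are illustrative simplifications, not the paper's exact formulation:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
T, n = 250, 4
prices = np.cumsum(rng.standard_normal((T, n)), axis=0) + 100.0

C = np.cov(prices, rowvar=False)     # covariance of asset prices over time
band, box = 1.0, 1.0                 # price-band half-width, per-asset weight bound

A = np.vstack([prices, -prices])     # encodes |prices @ s| <= band at every t
b = np.full(2 * T, band)

s = rng.standard_normal(n) * 1e-4    # small feasible starting point
s0 = s.copy()
for _ in range(20):
    g = 2.0 * C @ s                  # gradient of the convex objective s @ C @ s
    # linprog minimizes, so minimize -g @ s, i.e. maximize the linearization
    res = linprog(-g, A_ub=A, b_ub=b, bounds=[(-box, box)] * n)
    if not res.success:
        break
    s = res.x
```

Because each LP iterate is feasible and maximizes a lower bound on the objective, the portfolio price variance is non-decreasing across iterations, which is the basic guarantee of the convex-concave procedure.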
Neural Augmented Kalman Filtering with Bollinger Bands for Pairs Trading
Milstein, Amit, Deng, Haoran, Revach, Guy, Morgenstern, Hai, Shlezinger, Nir
Pairs trading is a family of trading techniques that determine their policies by monitoring the relationships between pairs of assets. A common pairs trading approach relies on describing the pair-wise relationship as a linear State Space (SS) model with Gaussian noise. This representation facilitates extracting financial indicators with low complexity and latency using a Kalman Filter (KF), which are then processed using classic policies such as Bollinger Bands (BB). However, such SS models are inherently approximate and mismatched, often degrading the revenue. In this work, we propose KalmanNet-aided Bollinger bands Pairs Trading (KBPT), a deep learning-aided policy that augments the operation of KF-aided BB trading. KBPT is designed by formulating an extended SS model for pairs trading that approximates their relationship as holding partial co-integration. This SS model is utilized by a trading policy that augments KF-BB trading with a dedicated neural network based on the KalmanNet architecture. The resulting KBPT is trained in a two-stage manner: the first stage tunes the tracking algorithm in an unsupervised manner, independently of the trading task; the second adapts it to track the financial indicators so as to maximize revenue, while approximating BB with a differentiable mapping. KBPT thus leverages data to overcome the approximate nature of the SS model, converting the KF-BB policy into a trainable model. We empirically demonstrate that our proposed KBPT systematically yields improved revenue compared with model-based and data-driven benchmarks across various assets.
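A minimal sketch of the model-based baseline that KBPT augments, a scalar Kalman filter tracking a random-walk hedge ratio with Bollinger bands applied to the innovation, might look as follows. The noise variances, window length, and synthetic pair are assumed toy values, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(2)
T = 500
x = 50.0 + np.cumsum(rng.standard_normal(T)) * 0.5   # asset 1 price
y = 1.5 * x + rng.standard_normal(T) * 0.5           # cointegrated asset 2

# Scalar Kalman filter: latent hedge ratio beta follows a random walk.
q, r = 1e-5, 0.25                    # assumed process / observation noise variances
beta, P = 0.0, 1.0
spread = np.zeros(T)                 # KF innovations, used as the traded spread
betas = np.zeros(T)
for t in range(T):
    P += q                           # predict step
    e = y[t] - beta * x[t]           # innovation
    S = x[t] * P * x[t] + r
    K = P * x[t] / S                 # Kalman gain
    beta += K * e                    # update hedge-ratio estimate
    P *= 1.0 - K * x[t]
    spread[t], betas[t] = e, beta

# Bollinger bands on the innovation: enter when it leaves +/- 2 rolling stds.
w = 20
mu = np.convolve(spread, np.ones(w) / w, mode="valid")
sd = np.array([spread[i:i + w].std() for i in range(T - w + 1)])
signal = np.where(spread[w - 1:] > mu + 2 * sd, -1.0,   # short the spread
         np.where(spread[w - 1:] < mu - 2 * sd, 1.0, 0.0))
```

KBPT's contribution is precisely to replace the fixed gain computation and the hard BB thresholds above with a trainable KalmanNet module and a differentiable band mapping.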
Deep Learning Statistical Arbitrage
Guijarro-Ordonez, Jorge, Pelger, Markus, Zanotti, Greg
Statistical arbitrage exploits temporal price differences between similar assets. We develop a unifying conceptual framework for statistical arbitrage and a novel data-driven solution. First, we construct arbitrage portfolios of similar assets as residual portfolios from conditional latent asset pricing factors. Second, we extract their time series signals with a powerful machine-learning time-series solution, a convolutional transformer. Lastly, we use these signals to form an optimal trading policy that maximizes risk-adjusted returns under constraints. Our comprehensive empirical study on daily US equities shows a high compensation for arbitrageurs to enforce the law of one price. Our arbitrage strategies obtain consistently high out-of-sample mean returns and Sharpe ratios, and substantially outperform all benchmark approaches.
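The first step of the pipeline, forming residual portfolios relative to latent factors, can be illustrated with plain PCA in place of the paper's conditional factor model. Dimensions and noise levels below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(3)
T, N, k = 300, 20, 3
# Toy return panel: k common factors plus idiosyncratic noise.
F = rng.standard_normal((T, k)) * 0.01               # latent factor returns
B = rng.standard_normal((N, k))                      # factor loadings
R = F @ B.T + rng.standard_normal((T, N)) * 0.005    # asset returns

Rc = R - R.mean(axis=0)
# PCA via SVD: the top-k right singular vectors estimate the loadings.
U, S, Vt = np.linalg.svd(Rc, full_matrices=False)
loadings = Vt[:k].T                       # N x k
factors = Rc @ loadings                   # T x k estimated factor returns
resid = Rc - factors @ loadings.T         # residual (arbitrage) portfolio returns
```

The residual returns are orthogonal to the estimated factors by construction; in the paper these residual series are what the convolutional transformer consumes to generate trading signals.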
MetaTrader: A Reinforcement Learning Approach Integrating Diverse Policies for Portfolio Optimization
Niu, Hui, Li, Siyuan, Li, Jian
Portfolio management is a fundamental problem in finance. It involves periodic reallocations of assets to maximize the expected returns within an appropriate level of risk exposure. Deep reinforcement learning (RL) has been considered a promising approach to solving this problem owing to its strong capability in sequential decision making. However, due to the non-stationary nature of financial markets, applying RL techniques to portfolio optimization remains a challenging problem. Extracting trading knowledge from various expert strategies could be helpful for agents to accommodate the changing markets. In this paper, we propose MetaTrader, a novel two-stage RL-based approach for portfolio management, which learns to integrate diverse trading policies to adapt to various market conditions. In the first stage, MetaTrader incorporates an imitation learning objective into the reinforcement learning framework. Through imitating different expert demonstrations, MetaTrader acquires a set of trading policies with great diversity. In the second stage, MetaTrader learns a meta-policy to recognize the market conditions and decide on the most proper learned policy to follow. We evaluate the proposed approach on three real-world index datasets and compare it to state-of-the-art baselines. The empirical results demonstrate that MetaTrader significantly outperforms those baselines in balancing profits and risks. Furthermore, thorough ablation studies validate the effectiveness of the components in the proposed approach.
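The second-stage idea, a meta-policy choosing among learned experts, can be caricatured with two hand-coded stand-in policies and selection by recent performance. This is a toy sketch of the selection mechanism only, not MetaTrader's learned meta-policy or its imitation-learning first stage:

```python
import numpy as np

rng = np.random.default_rng(4)
returns = rng.standard_normal(400) * 0.01            # toy daily log-returns

# Two hypothetical expert policies (positions in {-1, +1}); stand-ins for the
# diverse policies MetaTrader would learn by imitating expert demonstrations.
def momentum(r, t, w=10):
    return 1.0 if r[max(0, t - w):t].sum() > 0 else -1.0

def mean_reversion(r, t, w=10):
    return -momentum(r, t, w)

experts = [momentum, mean_reversion]
step_pnl = np.zeros((len(returns), len(experts)))    # per-step pnl of each expert
look = 30                                            # meta-policy evaluation window
meta_pnl = 0.0
for t in range(1, len(returns)):
    acts = [e(returns, t) for e in experts]
    step_pnl[t] = [a * returns[t] for a in acts]
    # follow whichever expert performed best over the recent window
    best = int(np.argmax(step_pnl[max(0, t - look):t].sum(axis=0)))
    meta_pnl += acts[best] * returns[t]
```

MetaTrader replaces the hard recent-pnl argmax with a learned meta-policy that recognizes market conditions, but the structure, a router over a pool of diverse policies, is the same.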
Model-based Reinforcement Learning for Predictions and Control for Limit Order Books
Wei, Haoran, Wang, Yuanbo, Mangu, Lidia, Decker, Keith
We build a profitable electronic trading agent with Reinforcement Learning that places buy and sell orders in the stock market. An environment model is built from historical observational data alone, and the RL agent learns the trading policy by interacting with the environment model instead of the real market, minimizing risk and potential monetary loss. Trained in an unsupervised and self-supervised fashion, our environment model learns a temporal and causal representation of the market in latent space through deep neural networks. We demonstrate that the trading policy trained entirely within the environment model can be transferred back into the real market and maintain its profitability. We believe that this environment model can serve as a robust simulator that predicts market movement as well as trade impact for further studies.
Transaction Costs-Aware Portfolio Optimization via Fast Lowner-John Ellipsoid Approximation
Shen, Weiwei (GE Global Research Center), Wang, Jun (Alibaba Group)
However, implementing such a strategy requires rebalancing continually as asset prices fluctuate, and therefore will lead to high or even infinite transaction costs. Since then researchers have tried to address this issue by solving Merton's portfolio problem in the presence of transaction costs. Thereinto, the proportional transaction costs model, as a suitable model for brokerage commissions and bid-ask spread costs, typifies the common situation for normal investors (Brandt 2010; Cvitanic 2001; …). By combining the VFI framework with policy parameterization, the proposed ADP method enjoys complementary advantages of low approximation errors from VFI and high computational efficiency from policy parameterization. Briefly, the components from VFI pave the way for effectively parameterizing a complex policy in a high-dimensional space; the components from policy parameterization provide a pathway to efficiently evaluating the strategy and bypassing the issue of error amplification.